SOME INEQUALITIES FOR CENTRAL MOMENTS OF MATRICES

ZOLTÁN LÉKA


Abstract. In this paper we shall study noncommutative central moment 
inequalities with a main focus on whether the commutative bounds are tight 
in the noncommutative case, or not. We prove that the answer is affirmative 
for the fourth central moment and several particular results are given in the 
general case. As an application, we shall present some lower estimates of the 
spread of Hermitian and normal matrices as well. 


1. Introduction 


Let $X$ be a random variable on a probability space $(\Omega, P)$. Then its $p$th (fractional) central moments are defined by the formula

$$\mu_p(X) = \int_\Omega \Bigl| X - \int_\Omega X \, dP \Bigr|^p \, dP.$$


The most studied noncommutative analogue of these quantities is the noncommutative variance or quantum variance. Let $M_n(\mathbb{C})$ be the algebra of $n \times n$ complex matrices. Whenever $\Phi \colon M_n(\mathbb{C}) \to M_m(\mathbb{C})$ is a positive unital linear map, the variance of a matrix $A$ can be defined as $\Phi(A^*A) - \Phi(A)^*\Phi(A)$. For several interesting properties of this variance, we refer the reader to Bhatia's book [4]. For instance, special choices of $\Phi$ and applications of variance estimates provided simple new proofs of spread estimates of normal and Hermitian matrices as well, see [5] and [Bj. On the other hand, the first sharp estimate of the noncommutative variance appeared in K. Audenaert's paper [1] in connection with the Böttcher-Wenzel commutator estimate. For several different proofs of his result, we refer to 0, Ej and [23]. Recently, extremal properties of the quantum variance were studied in [20].

It is simple to see that if $\omega$ is a state (i.e. a positive linear functional of norm 1) of the algebra $M_n(\mathbb{C})$, then one has the upper bound

$$\omega(|A - \omega(A)|^2) = \omega(|A|^2) - |\omega(A)|^2 \le \|A\|^2$$

(see [5, Theorem 3.1] for positive linear maps). A careful look at the previous inequality shows that the noncommutative variance cannot be larger than the ordinary variance of random variables. In fact, if $X$ is a Bernoulli variable, that is, $P(X = 0) = p$ and $P(X = 1) = 1 - p$ ($0 \le p \le 1$), then $\mu_2(X) \le 1/4$ holds. Furthermore, for any (complex-valued) random variable $Z \colon \Omega \to \mathbb{C}$ the inequality

$$\sqrt{\mu_2(Z)} \le 2 \max\{\sqrt{\mu_2(X)} : X \text{ Bernoulli random variable}\}\, \|Z\|_\infty = \|Z\|_\infty$$

readily follows, see [1, Theorem 7] in the discrete case and [14, Theorem 2] in the general case, for instance. Furthermore, one can have the following upper bound for the $n$th ($n \in \mathbb{N}$) central moment of a normal element $A$ in matrix and $C^*$-algebras:

$$\sqrt[n]{\omega(|A - \omega(A)|^n)} \le 2 \max\{\sqrt[n]{\mu_n(X)} : X \text{ Bernoulli random variable}\} \min_{\lambda \in \mathbb{C}} \|A - \lambda\|,$$

see [14, Theorem 2].

Our main motivation is to provide sharp upper bounds on the noncommutative central moments of arbitrary matrices and to decide whether the noncommutative dispersion can be larger than the commutative one. At present we are able to tackle the problem only for $1 \le p \le 2$, for $p = 4$, and for the tracial state if $1 \le p < \infty$. For some recent results on complementary upper bounds on fourth central moments of matrices, the reader might see the paper [23]. We note that a very similar phenomenon was observed by K. Audenaert [2]. He proved that the asymmetry of the quantum relative entropy essentially cannot be larger than that of two Bernoulli distributions.

The paper is organized as follows. In the next section we prove an inequality for the fourth moments of partial isometries and positive linear maps given by unit vectors. To remove these assumptions, we shall apply a dilation method and a rank estimate on the extreme points of convex sets of density matrices. In the last section of the paper, we shall produce some general results on $p$th moments of matrices, determined by the tracial state. As an application, we shall present several lower bounds for the spread of normal and Hermitian matrices.

2. General moment inequalities 

2.1. A moment estimate of partial isometries. Let $X$ be a Bernoulli random variable with parameter $p$. Then one clearly has the inequality $\mu_4(X) \le \frac{1}{12}$, while $\mu_4(Z) \le \frac{4}{3} \min_{\lambda \in \mathbb{C}} \max_i |z_i - \lambda|^4$ holds for any finite-valued random variable $Z$. From a geometric point of view, the quantity $\min_{\lambda \in \mathbb{C}} \max_i |z_i - \lambda|$ is the radius of the smallest enclosing circle of the values of $Z$. For several inequalities in connection with it and vector norms, the reader might see [1, Section 4].
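The two constants above can be checked with a short computation. The sketch below is only an illustration, not part of the original argument: it evaluates the fourth central moment of a Bernoulli variable exactly on a rational grid (the grid size `N` is an arbitrary choice) and confirms that it stays below $1/12$, and that rescaling by the factor $2^4$ gives $4/3$.

```python
from fractions import Fraction

def bernoulli_mu4(p):
    """Fourth central moment of a Bernoulli variable with P(X=0)=p, P(X=1)=1-p."""
    q = 1 - p
    # The mean is q, so |X - mean|^4 equals q^4 with probability p and p^4 with probability q.
    return p * q**4 + q * p**4

N = 1000  # illustrative grid resolution (an assumption, not from the paper)
grid_max = max(bernoulli_mu4(Fraction(k, N)) for k in range(N + 1))
bound = Fraction(1, 12)

# (2 * (1/12)^(1/4))^4 = 16/12, the constant in front of the enclosing-circle radius.
scaled_constant = 16 * bound
```

The maximiser satisfies $p(1-p) = 1/6$, which is irrational in $p$, so the grid maximum is slightly below $1/12$ rather than equal to it.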

Our first result gives the corresponding noncommutative moment estimate for partial isometries. Recall that an $n \times n$ matrix $V$ is a partial isometry if $V$ is an isometry on the orthogonal complement of its kernel. A useful characterization says that $V$ is a partial isometry if and only if $V^*V$ is an orthogonal projection (onto the subspace $(\ker V)^\perp$), or, which is the same, $V = VV^*V$ (see [TT], [2TJ page 95]). Hence $V^*$ is a partial isometry and $VV^*$ is an orthogonal projection as well.

We start with a technical lemma. 

Lemma 1. Let $x_1, x_2, x_3, x_4 \in \mathbb{R}$. Then

$$\max_x \,\bigl(2x_1^2 x_3 - 2x_1 x_4 + x_2^2\bigr) = 1$$

subject to the constraints $0 \le x_3 \le x_2 \le 1$ and $x_1 \le x_4 + \sqrt{(1 - x_2^2)(x_2^2 - x_3^2)}$.

Proof. With the change of variables $y_2 = \sqrt{1 - x_2^2}$, $y_3 = \sqrt{x_2^2 - x_3^2}$ and $y_1 = x_1$, $y_4 = x_4$, we have $2x_1^2 x_3 - 2x_1 x_4 + x_2^2 - 1 = 2y_1^2\sqrt{1 - y_2^2 - y_3^2} - 2y_1 y_4 - y_2^2$. Notice that the last function is convex in $y_1$, hence it attains its maximum when $y_1$ is the largest, i.e. $y_1 = y_4 + y_2 y_3$. Therefore, it is enough to prove the general statement

$$\max G(y_1, y_2, y_3, y_4) = \max \Bigl( 2y_1^2\sqrt{1 - y_2^2 - y_3^2} - 2y_1 y_4 - y_2^2 \Bigr) = 0 \quad \text{s.t. } y_1 = y_4 + y_2 y_3.$$

First, we compute the extrema of $G$ when $(y_2, y_3, y_4)$ is in the open cylinder $\mathbb{D} \times \mathbb{R}$, where $\mathbb{D}$ denotes the open unit disk of the plane. To do this, let us find the constrained critical points of the Lagrangian $\mathcal{L}(y, \lambda) = G(y) - \lambda c(y)$, where the constraint function is $c(y) = y_1 - y_4 - y_2 y_3$. A little calculation gives for the gradient equation $\nabla \mathcal{L}(y, \lambda) = 0$ that

$$\begin{aligned}
-\lambda - 2y_4 + 4y_1\sqrt{1 - y_2^2 - y_3^2} &= 0 \\
\lambda y_3 - \frac{2y_1^2 y_2}{\sqrt{1 - y_2^2 - y_3^2}} - 2y_2 &= 0 \\
\lambda y_2 - \frac{2y_1^2 y_3}{\sqrt{1 - y_2^2 - y_3^2}} &= 0 \\
\lambda - 2y_1 &= 0 \\
-y_1 + y_2 y_3 + y_4 &= 0.
\end{aligned}$$


To solve this system, note that if $y_1 = \lambda/2 = 0$ then $y_2 = y_4 = 0$ and $-1 < y_3 < 1$. From $\lambda = y_4 = 0$, it is simple to check that $G \le 0$. Indeed, $y_1 = y_2 y_3$ and

$$2y_2^2 y_3^2 \sqrt{1 - y_2^2 - y_3^2} - y_2^2 \le 0,$$

because of the inequalities

$$2y_3^2\sqrt{1 - y_2^2 - y_3^2} \le 2y_3^2\sqrt{1 - y_3^2} \le 2\sqrt{y_3^2(1 - y_3^2)} \le 1.$$

On the other hand, if $\lambda \ne 0$, from the third and fourth equations

$$2y_2\sqrt{1 - y_2^2 - y_3^2} = \lambda y_3.$$

Substituting this into the second one, we obtain that

$$y_2(1 - y_2^2 - y_3^2) = y_1^2 y_2 + y_2\sqrt{1 - y_2^2 - y_3^2}.$$

Since $1 - y_2^2 - y_3^2 < y_1^2 + \sqrt{1 - y_2^2 - y_3^2}$ if $(y_2, y_3) \in \mathbb{D}$ and $y_1 \ne 0$, it follows that $y_2 = 0$. Clearly, $y_2 = y_3 = 0$ and $y_1 = y_4 = \lambda/2$ hold. The corresponding Hessian of $\mathcal{L}$ at a stationary point $(y^*, \lambda^*) = (t, 0, 0, t, 2t)$ is

$$\begin{pmatrix} 4 & 0 & 0 & -2 \\ 0 & -2(t^2+1) & 2t & 0 \\ 0 & 2t & -2t^2 & 0 \\ -2 & 0 & 0 & 0 \end{pmatrix}.$$


Now let us consider the $y_4$-sections of the cylinder $\mathbb{D} \times \mathbb{R}$; that is, add the constraint $y_4 = t$ ($t \ne 0$) to the optimization problem. Then $c_2(y) = y_4 - t$ and $\nabla c_2(y^*) = [0, 0, 0, 1]^*$. Note that $\nabla c(y^*) = [1, 0, 0, -1]^*$, hence the tangent plane of the constraints at $y^*$ is

$$T(\lambda^*) := \{w : w^* \nabla c(y^*) = 0 \text{ and } w^* \nabla c_2(y^*) = 0\} = \{[0, w_2, w_3, 0]^* : w_i \in \mathbb{R}\}.$$

Furthermore, we obtain

$$w^* \nabla_{yy} \mathcal{L}(y^*, \lambda^*) w = -2t^2 w_2^2 - 2(w_2 - t w_3)^2 < 0, \qquad 0 \ne w \in T(\lambda^*).$$

Thus $y^*$ is a strict local maximum of $G$ subject to $c$ and $c_2$, see [TUI Theorem 12.6], and $G(y^*) = 0$.

For $y_4 \ne 0$, all $y_4$-sections of the cylinder $\mathbb{D} \times \mathbb{R}$ contain exactly one local maximum point, hence 0 is the global maximum of $G$ on its domain (s.t. the constraint). □
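As a sanity check on Lemma 1, one can sample feasible points and confirm that the objective never exceeds 1. The sketch below is a numerical illustration only and makes one extra assumption, labeled in the code: it restricts to $x_1 \ge 0$, which is the case used later (there $x_1 = a \ge 0$). The sampling ranges and the seed are arbitrary choices.

```python
import math
import random

def objective(x1, x2, x3, x4):
    """The function maximised in Lemma 1."""
    return 2 * x1**2 * x3 - 2 * x1 * x4 + x2**2

def sample_feasible(rng):
    """Draw a point satisfying 0 <= x3 <= x2 <= 1 and x1 <= x4 + sqrt((1-x2^2)(x2^2-x3^2))."""
    x2 = rng.uniform(0.0, 1.0)
    x3 = rng.uniform(0.0, x2)
    x1 = rng.uniform(0.0, 2.0)          # assumption: only x1 >= 0 is sampled
    gap = math.sqrt(max(0.0, (1 - x2**2) * (x2**2 - x3**2)))
    slack = rng.uniform(0.0, 1.0)       # nonnegative slack keeps the constraint satisfied
    x4 = x1 - gap + slack
    return x1, x2, x3, x4

rng = random.Random(0)
largest = max(objective(*sample_feasible(rng)) for _ in range(100_000))

# The stationary points (t, 0, 0, t) in the y-variables correspond to x2 = x3 = 1, x1 = x4 = t.
at_optimum = objective(0.5, 1.0, 1.0, 0.5)
```

Random sampling cannot prove the lemma; it only makes a transcription error in the constraints or the objective easy to spot.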


Theorem 1. Let $V$ be a partial isometry in $M_n(\mathbb{C})$. Let $Q \in M_n(\mathbb{C})$ be a rank-one orthogonal projection. Then

$$\operatorname{Tr}\bigl[Q\,|V - \operatorname{Tr}[QV]|^4\bigr] \le \frac{4}{3}.$$

Moreover, if equality holds then $|V|Q = Q|V|$.
Proof. Without loss of generality, we can assume that $a := \operatorname{Tr} QV = \operatorname{Tr} QV^* \ge 0$. Let $V = UP$ be a polar decomposition of $V$, where $P = V^*V$ is an orthogonal projection and $U$ is unitary. Choose a unit vector $x$ such that $Q = xx^*$. First, one has that

$$\begin{aligned}
\operatorname{Tr}[Q|V - aI|^4] = \operatorname{Tr}\bigl[Q\bigl(&V^*V - aV^*V^2 - aV^*VV^* + a^2V^*V \\
&- aVV^*V + a^2V^2 + a^2VV^* - a^3V \\
&- aV^{*2}V + a^2V^*V + a^2V^{*2} - a^3V^* \\
&+ a^2V^*V - a^3V - a^3V^* + a^4I\bigr)\bigr]
\end{aligned}$$

and applying the identities $V^*VV^* = V^*$ and $VV^*V = V$,

$$= \|Px\|^2 - 2a\operatorname{Re}\langle V^*V^2x, x\rangle + a^2\bigl(3\|V^*Vx\|^2 + \|VV^*x\|^2 - 2\bigr) + 2a^2\operatorname{Re}\langle V^2x, x\rangle - 3a^4$$

and since $\|V^*Vx\|^2 + \|VV^*x\|^2 \le 2$,

$$\le \|Px\|^2 - 2a\operatorname{Re}\langle Vx, Px\rangle + 2a^2\|Px\|^2 + 2a^2|\operatorname{Re}\langle V^2x, x\rangle| - 3a^4.$$

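Next, from the Cauchy-Schwarz inequality

$$|\operatorname{Re}\langle Vx, V^*x\rangle| = |\operatorname{Re}\langle PUPx, PU^*x\rangle| \le \|PUPx\|\,\|PU^*x\| \le \|PUPx\|.$$

Therefore, we obtain

$$\begin{aligned}
\operatorname{Tr}[Q|V - aI|^4] &\le \|Px\|^2 + 2a^2\|Px\|^2 - 3a^4 - 2a\operatorname{Re}\langle UPx, Px\rangle + 2a^2\|PUPx\| \\
&\le \max_{0 \le a \le 1}\bigl(1 + 2a^2 - 3a^4\bigr) + \max\bigl(\|Px\|^2 - 1 - 2a\operatorname{Re}\langle UPx, Px\rangle + 2a^2\|PUPx\|\bigr).
\end{aligned}$$

It is simple to check that

$$\max_{0 \le a \le 1}\bigl(1 + 2a^2 - 3a^4\bigr) = \frac{4}{3}.$$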

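For the remaining part of the previous inequality, from the Cauchy-Schwarz inequality we have the following constraint for $0 \le a$:

$$\begin{aligned}
a = \operatorname{Re}\langle UPx, x\rangle &= \operatorname{Re}\langle PUPx, Px\rangle + \operatorname{Re}\langle P^\perp UPx, P^\perp x\rangle \\
&\le \operatorname{Re}\langle UPx, Px\rangle + \|P^\perp UPx\|\,\|P^\perp x\| \\
&= \operatorname{Re}\langle UPx, Px\rangle + \bigl(\|UPx\|^2 - \|PUPx\|^2\bigr)^{1/2}\bigl(1 - \|Px\|^2\bigr)^{1/2} \\
&= \operatorname{Re}\langle UPx, Px\rangle + \bigl(\|Px\|^2 - \|PUPx\|^2\bigr)^{1/2}\bigl(1 - \|Px\|^2\bigr)^{1/2}.
\end{aligned}$$

Hence we can apply Lemma 1 (with $x_1 = a$, $x_2 = \|Px\|$, $x_3 = \|PUPx\|$ and $x_4 = \operatorname{Re}\langle UPx, Px\rangle$) to obtain that

$$2a^2\|PUPx\| - 2a\operatorname{Re}\langle UPx, Px\rangle + \|Px\|^2 \le 1.$$

Thus the inequality

$$\operatorname{Tr}[Q|V - aI|^4] \le \frac{4}{3}$$

follows.

Note that when equality occurs, $\|Px\|^2 = 1$ must hold. This means that $Px = x$, which is the same as $x \in \operatorname{ran} P$, or, $QV^*V = V^*VQ$. □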
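The polynomial $1 + 2a^2 - 3a^4$ in the proof also arises from a simple commutative example, which may help to see where $4/3$ comes from. The sketch below is an illustration under an assumed choice of matrix, not taken from the paper: $V = \operatorname{diag}(1, -1)$ and $Q = xx^*$ with $x = (\cos\theta, \sin\theta)$. For this choice $a = \cos 2\theta$ and the fourth moment works out to $(1 - a^2)(1 + 3a^2) = 1 + 2a^2 - 3a^4$.

```python
import math

def fourth_moment(theta):
    """Tr[Q |V - Tr[QV]|^4] for V = diag(1, -1) and Q = x x*, x = (cos t, sin t)."""
    w_plus, w_minus = math.cos(theta) ** 2, math.sin(theta) ** 2  # weights of eigenvalues +1, -1
    mean = w_plus - w_minus                                       # a = Tr[QV] = cos(2 theta)
    return w_plus * abs(1 - mean) ** 4 + w_minus * abs(-1 - mean) ** 4

# The maximum of 1 + 2a^2 - 3a^4 is attained at a^2 = 1/3, i.e. cos(2 theta) = 1/sqrt(3).
theta_star = 0.5 * math.acos(1 / math.sqrt(3))
peak = fourth_moment(theta_star)

# A coarse grid over theta (2001 points, an arbitrary choice) never exceeds 4/3.
grid_max = max(fourth_moment(k * math.pi / 2000) for k in range(2001))
```

Here $V$ is a Hermitian unitary, so this only shows that the bound of Theorem 1 is attained in the commutative setting.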
Example 1. We give an example of a non-normal matrix to show that the previous inequality is tight. Set the partial isometry



1 

1 

1 

0 



1 0 

- 1/2 1 

- 1/2 -1 

0 1 


and define 


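$$Q = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}.$$

In fact, $V$ is a partial isometry because $V^*V$ is an orthogonal projection. Moreover, the spectrum of $V$ is $\sigma(V) = \{1, 1/\sqrt{3}, 0, -1\}$, hence $\min_{\lambda \in \mathbb{C}} \|V - \lambda I_n\| = 1$. Furthermore, it is simple to see that

$$\operatorname{Tr}\bigl[Q|V - \operatorname{Tr}[QV]|^4\bigr] = \frac{4}{3}.$$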


2.2. Convex sets of density matrices. Let $A_1, A_2, \dots, A_k \in M_n(\mathbb{C})$ be Hermitian matrices and let $\alpha_1, \alpha_2, \dots, \alpha_k$ be real numbers. Let us consider the convex, compact set

$$\mathcal{D}(A_1, A_2, \dots, A_k) := \{X \ge 0 : \operatorname{Tr} X = 1 \text{ and } \operatorname{Tr}[XA_i] = \alpha_i,\ i = 1, 2, \dots, k\}.$$

Note that $\mathcal{D}(A_1, A_2, \dots, A_k) = \mathcal{D}(A_1 - \alpha_1 I, A_2 - \alpha_2 I, \dots, A_k - \alpha_k I)$. The geometry of $\mathcal{D}(A_1, A_2, \dots, A_k)$ is strongly related to that of the elliptope; i.e. the set of real $n \times n$ symmetric positive semidefinite matrices with an all-one diagonal (briefly, correlation matrices) [Rj, [f). Chapter 31.5]. Additionally, we used the set $\mathcal{D}(A_1, A_2, \dots, A_k)$ to provide a description of the extreme non-commutative covariance matrices associated to Hermitian tuples (see mi

We recall that whenever $D$ is an extreme point of $\mathcal{D}(A_1, A_2, \dots, A_k)$, one has the rank estimate [12, Corollary 1]

$$(1) \qquad \operatorname{rank} D \le \sqrt{k + 1}.$$

We remark that the proof of the previous inequality is closely related to a method invented by C.K. Li and B.S. Tam m in order to describe the extreme correlation matrices.

Turning back to moment inequalities, Audenaert's theorem [1] on the (quantum) standard deviation states that for any $A \in M_n(\mathbb{C})$ there exists a rank-one orthogonal projection $P$ such that

$$\max_{D \ge 0,\ \operatorname{Tr} D = 1} \bigl(\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^2]\bigr)^{1/2} = \bigl(\operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^2]\bigr)^{1/2} = \min_{\lambda \in \mathbb{C}} \|A - \lambda I\|.$$

This result was proved directly in [7] and in 0 for $C^*$-algebras by means of a characterization of the Birkhoff-James orthogonality in matrix and operator algebras, respectively.

Throughout the paper we say that an $n \times n$ matrix $D$ is a density if $D$ is positive semidefinite and $\operatorname{Tr} D = 1$. Exploiting the aforementioned rank estimate, we can now prove the following.

Theorem 2. Let $D \in M_n(\mathbb{C})$ be a density. For any $1 \le p < \infty$ and $A \in M_n(\mathbb{C})$, there exists a rank-one orthogonal projection $P \in M_n(\mathbb{C})$ such that

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^p] = \operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^p].$$
Proof. Without loss of generality we can assume that $\operatorname{Tr}[DA] = a$ is real, hence $\operatorname{Tr}\bigl[D\,\tfrac{A + A^*}{2}\bigr] = a$ holds as well. Let us introduce the convex set

$$\mathcal{D}\Bigl(|A - aI_n|^p, \tfrac{A + A^*}{2}\Bigr) := \Bigl\{X \text{ density} : \operatorname{Tr}[X|A - aI_n|^p] = \operatorname{Tr}[D|A - aI_n|^p] \text{ and } \operatorname{Tr}\bigl[X\,\tfrac{A + A^*}{2}\bigr] = a\Bigr\}.$$

Obviously, $\mathcal{D}\bigl(|A - aI_n|^p, \tfrac{A + A^*}{2}\bigr)$ is non-empty. Relying upon the inequality (1) with $k = 2$, the rank of the extreme points of $\mathcal{D}$ is at most $\sqrt{3}$, hence equal to 1. Since any rank-one density is an orthogonal projection, the proof is complete. □

Now we can prove the main theorem of the section. 

2.3. A 4-order moment estimate. 


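Theorem 3. Let $A \in M_n(\mathbb{C})$ and let $D \in M_n(\mathbb{C})$ be a density matrix. Then

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^4] \le \frac{4}{3} \min_{\lambda \in \mathbb{C}} \|A - \lambda I_n\|^4.$$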

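Proof. Without loss of generality, we can assume that $\|A\| = 1$ holds. Now form the partial isometry (see |TJ|)

$$V = \begin{pmatrix} A & (I - AA^*)^{1/2} \\ 0 & 0 \end{pmatrix} \in M_{2n}(\mathbb{C}).$$

Let

$$\tilde D := \begin{pmatrix} D & 0 \\ 0 & 0 \end{pmatrix},$$

which is a density matrix, of course. Then a straightforward calculation gives that

$$|V - \operatorname{Tr}[\tilde D V]|^4 = \begin{pmatrix} |A - \operatorname{Tr}[DA]|^4 + X & * \\ * & * \end{pmatrix},$$

where $X = (A - \operatorname{Tr}[DA])^*(I - AA^*)(A - \operatorname{Tr}[DA]) \ge 0$, hence

$$|A - \operatorname{Tr}[DA]|^4 \oplus 0 \le \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix} |V - \operatorname{Tr}[\tilde D V]|^4 \begin{pmatrix} I & 0 \\ 0 & 0 \end{pmatrix}.$$

Therefore the inequality

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]I_n|^4] \le \operatorname{Tr}[\tilde D|V - \operatorname{Tr}[\tilde D V]I_{2n}|^4]$$

follows. Additionally, relying on Theorem 2, one can assume that $D = xx^*$ is a rank-one orthogonal projection with some unit vector $x$. Then Theorem 1 immediately gives that

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^4] \le \operatorname{Tr}[\tilde D|V - \operatorname{Tr}[\tilde D V]|^4] \le \frac{4}{3} = \frac{4}{3}\|A\|^4.$$

Changing $A$ to $A - \lambda I$, we get the proof of the statement. □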
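The dilation step can be made concrete in the smallest case $n = 1$, where $A$ is a scalar $a$ with $|a| \le 1$. The sketch below is an illustration only: the value of `a` and the helper names are hypothetical. It checks that $V = \begin{pmatrix} a & \sqrt{1 - |a|^2} \\ 0 & 0 \end{pmatrix}$ satisfies $VV^* = \operatorname{diag}(1, 0)$ and $VV^*V = V$, so $V$ is a partial isometry.

```python
import math

def adjoint(X):
    """Conjugate transpose of a 2x2 matrix stored as nested lists."""
    return [[complex(X[j][i]).conjugate() for j in range(2)] for i in range(2)]

def matmul(X, Y):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dilation(a):
    """Scalar case (n = 1) of V = [[A, (I - AA*)^(1/2)], [0, 0]], assuming |a| <= 1."""
    return [[complex(a), complex(math.sqrt(1 - abs(a) ** 2))], [0j, 0j]]

def max_entry_gap(X, Y):
    """Largest entrywise distance between two 2x2 matrices."""
    return max(abs(X[i][j] - Y[i][j]) for i in range(2) for j in range(2))

a = 0.3 + 0.4j                       # hypothetical contraction with |a| = 0.5
V = dilation(a)
VVstar = matmul(V, adjoint(V))
gap_projection = max_entry_gap(VVstar, [[1, 0], [0, 0]])       # V V* = diag(1, 0)
gap_partial_isometry = max_entry_gap(matmul(VVstar, V), V)     # V V* V = V
```

For general $n$ the same identities hold with $(I - AA^*)^{1/2}$ in place of the scalar square root, which requires a matrix square root routine not included here.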


Surprisingly, the next example shows that if $A$ is not normal, the previous upper bound is not necessarily sharp. Compare it with Example 1 and Audenaert's theorem on the noncommutative variance.

Example 2. Let $A$ denote the Jordan block $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. We calculate the value of

$$\mu_4(A) := \max\{\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^4] : 0 \le D \in M_2(\mathbb{C}) \text{ and } \operatorname{Tr} D = 1\}.$$

Indeed, from Theorem 2 we can find a projection $P = zz^*$, $z = [z_1\ z_2]^T \in \mathbb{C}^2$ and $|z_1|^2 + |z_2|^2 = 1$, such that

$$\mu_4(A) = \operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^4].$$

Then a little computation gives that

$$\begin{aligned}
\operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^4] &= |z_1|^6|z_2|^4 + |z_1|^4|z_2|^2 + |z_1|^2|z_2|^4 \\
&\quad - 2|z_1|^2|z_2|^2\bigl(2|z_1|^2|z_2|^2 + 1\bigr) + |z_2|^2\bigl(1 + |z_1|^2|z_2|^2\bigr)^2 \\
&= 4|z_2|^6 - 3|z_2|^8 =: p(|z_2|).
\end{aligned}$$

Note that $\max_{|z_2| \le 1} p(|z_2|) = p(1) = 1$. Furthermore, it is simple to check that $\min_{\lambda \in \mathbb{C}} \|A - \lambda I_2\| = \|A\| = 1$. Hence we get that

$$\mu_4(A) = \min_{\lambda \in \mathbb{C}} \|A - \lambda I\|^4.$$

However, from m Theorem 4] we have

$$\mu_4(A) = \frac{4}{3} \min_{\lambda \in \mathbb{C}} \|A - \lambda\|^4$$

for any normal $A$.
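The closed form $p(|z_2|) = 4|z_2|^6 - 3|z_2|^8$ can be confirmed numerically. The sketch below is a check under the assumption that $A$ is the upper Jordan block $\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$: it computes $\operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^4]$ directly for random unit vectors $z$ and compares with $p$. The function names and the seed are illustrative.

```python
import cmath
import math
import random

def moment4_jordan(z1, z2):
    """Tr[P |A - Tr[PA]|^4] for A = [[0, 1], [0, 0]] and P = z z*, with |z1|^2 + |z2|^2 = 1."""
    c = z1.conjugate() * z2                 # Tr[PA] = <Az, z>
    m = abs(c) ** 2
    # M = |A - c|^2 = (A - c)*(A - c) = [[m, -conj(c)], [-c, 1 + m]]; apply M to z.
    w1 = m * z1 - c.conjugate() * z2
    w2 = -c * z1 + (1 + m) * z2
    # z* M^2 z = ||M z||^2 because M is Hermitian.
    return abs(w1) ** 2 + abs(w2) ** 2

def p(r):
    """Closed form from Example 2, as a function of r = |z2|."""
    return 4 * r ** 6 - 3 * r ** 8

rng = random.Random(1)
worst_gap = 0.0
for _ in range(1000):
    r = rng.uniform(0.0, 1.0)
    z1 = math.sqrt(1 - r * r) * cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    z2 = r * cmath.exp(1j * rng.uniform(0, 2 * math.pi))
    worst_gap = max(worst_gap, abs(moment4_jordan(z1, z2) - p(r)))

value_at_e2 = moment4_jordan(0j, 1 + 0j)    # z = e_2 gives the maximum p(1)
```

Since $p'(r) = 24r^5(1 - r^2) \ge 0$ on $[0, 1]$, the maximum is indeed at $|z_2| = 1$.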

Here we make a direct application of Theorem 3 to obtain a lower bound for the spread of normal and Hermitian matrices. If $A$ is an $n \times n$ matrix and $\lambda_i(A)$ ($1 \le i \le n$) denote its eigenvalues then the spread of $A$ is

$$\operatorname{spd}(A) = \max_{i,j} |\lambda_i(A) - \lambda_j(A)|.$$

Spread estimates were initiated by L. Mirsky in his seminal papers DU and PS-. After that several authors provided upper and lower bounds for it, see m, ei, m and PS, for instance, and the references therein.

For a normal $A$ the spectral theorem gives that $\min_{\lambda \in \mathbb{C}} \|A - \lambda I\| = r_A$, where $r_A$ denotes the radius of the smallest disk that contains the eigenvalues of $A$. Jung's theorem on the plane says that if $P$ is a finite set of points of diameter $d$ then $P$ must be contained in a closed disk of radius $d/\sqrt{3}$, see ES Chapter 16]. Hence, for any normal $A$,

$$(2) \qquad \min_{\lambda \in \mathbb{C}} \|A - \lambda I\| \le \frac{1}{\sqrt{3}} \operatorname{spd}(A)$$

(see [12, p. 1567-1568]).
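The constant $1/\sqrt{3}$ in (2) is attained when the eigenvalues form an equilateral triangle. As a small illustration, not taken from the paper, take the eigenvalues to be the three cube roots of unity; the smallest enclosing disk of an equilateral triangle is its circumscribed disk, here the unit disk centred at the origin.

```python
import cmath
import math

# Eigenvalues of a hypothetical normal matrix: the cube roots of unity.
eigs = [cmath.exp(2j * math.pi * k / 3) for k in range(3)]

# Spread: largest distance between two eigenvalues (the side length sqrt(3)).
spread = max(abs(a - b) for a in eigs for b in eigs)

# Radius of the smallest enclosing disk; for this symmetric set it is centred at 0.
radius = max(abs(z) for z in eigs)

jung_bound = spread / math.sqrt(3)
```

For this configuration `radius` and `jung_bound` coincide, so inequality (2) cannot be improved for normal matrices in general.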

Corollary 1. Let $A \in M_n(\mathbb{C})$ be a normal matrix. Then

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^4] \le \frac{4}{27} \operatorname{spd}(A)^4.$$

Moreover, if $A$ is Hermitian then

$$\operatorname{Tr}[D|A - \operatorname{Tr}[DA]|^4] \le \frac{1}{12} \operatorname{spd}(A)^4.$$

For a different proof of the last statement, the reader might see PI p. 169 Remark], [23, Theorem 3] and [13, Theorem 2] for the normal case in Theorem 3.

We recall that the quantity $\Delta(A) = \min_{\lambda \in \mathbb{C}} \|A - \lambda I\|$ appeared in Stampfli's well-known result m for the derivation norm

$$2\Delta(A) = \max_{\|X\| = 1} \|AX - XA\|,$$

while the diameter of the unitary orbit of $A$ is given by the formula

$$2\Delta(A) = \max\{\|A - UAU^*\| : U \text{ is unitary}\},$$

see [7]. Hence any lower estimate of $\Delta(A)$ in terms of central moments might have its own interest. In the case of the noncommutative variance, this method was first exploited by R. Bhatia and R. Sharma in a series of papers. Choosing different density matrices in the variance inequality, they got several interesting inequalities for $\Delta(A)$ and the spread of $A$ as well. Briefly, the idea of non-commutative variance estimates turned out to be fruitful and led to simple new proofs of known spread estimates, including Mirsky's and Barnes-Hoffman's lower bounds (see [5] for details).


2.4. Remark. Let $\omega$ be a positive linear functional of $M_n(\mathbb{C})$. Then the map $A \mapsto \omega(|A|^p)^{1/p}$, $p \ne 2$, is not a norm on $M_n(\mathbb{C})$, because the triangle inequality fails in general. However, the monotonicity statement

$$\omega(|A|^p)^{1/p} \le \omega(|A|^{p'})^{1/p'}$$

clearly holds for all $1 \le p \le p' < \infty$. In fact, $\omega(A) = \operatorname{Tr} DA$ with some $D \ge 0$ and $\operatorname{Tr} D = 1$. Furthermore, we can assume that $|A| = (A^*A)^{1/2}$ is diagonal, hence the discrete Hölder inequality gives that

$$\Bigl(\sum_{i=1}^n d_{ii} a_{ii}^p\Bigr)^{1/p} \le \Bigl(\sum_{i=1}^n d_{ii} a_{ii}^{p'}\Bigr)^{1/p'},$$

which is exactly what we need. Moreover,

$$\omega(|A - \omega(A)|^2)^{1/2} = \bigl(\omega(|A|^2) - |\omega(A)|^2\bigr)^{1/2} \le \omega(|A|^2)^{1/2} \le \|A\|.$$

Therefore, we get for any $1 \le p \le 2$ and $A \in M_n(\mathbb{C})$ that

$$\omega(|A - \omega(A)|^p)^{1/p} \le \min_{\lambda \in \mathbb{C}} \|A - \lambda\|.$$

Note that simple calculus gives that

$$2\max\{(E|X - E(X)|^p)^{1/p} : X \text{ Bernoulli random variable}\} = 1$$

if $1 \le p \le 2$, hence the commutative bound turns out to be a tight bound in the non-commutative case as well. Actually, set A =


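and $P = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$. Then

$$\operatorname{Tr}[P|A - \operatorname{Tr}[PA]|^p] = 1 = \min_{\lambda \in \mathbb{C}} \|A - \lambda\|.$$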


2.5. Remark. It would be interesting to know whether, for a positive unital linear map $\Phi \colon M_n(\mathbb{C}) \to M_k(\mathbb{C})$, the inequality

^(lA-^l^/^lminllA-AJII 

holds. The corresponding result for the noncommutative standard deviation was 
proved in [5, Theorem 3.1].


3. Moment inequalities for the tracial state 

We start this section with a moment estimate of matrices, determined by the 
tracial state. 
3.1. Central moments for the tracial state. Let $A$ denote an $n \times n$ matrix and let us define its Schatten $p$-norm

$$\|A\|_p = (\operatorname{Tr}|A|^p)^{1/p},$$

where $1 \le p < \infty$ and $|A| = (A^*A)^{1/2}$ by definition. Then $\|\cdot\|_p$ is a norm on $M_n(\mathbb{C})$. We recall that the duality formula

$$\|A\|_p = \max\{|\operatorname{Tr}[B^*A]| : \|B\|_q \le 1\}$$

holds with $1/p + 1/q = 1$ ([21, Theorem 7.1]).

Let $X$ be a (real) discrete random variable on a finite set $\{1, \dots, n\}$. We recall that $E(|X - E(X)|^p)$ is the largest if the underlying probability measure is concentrated on at most two atoms (for instance, see the proof [TU p. 168]). Moreover, $\mu_p^{1/p}(\alpha X - \beta) = |\alpha|\,\mu_p^{1/p}(X)$ for any real $\alpha$ and $\beta$, thus

$$E(|X - E(X)|^p)^{1/p} \le b_p^{1/p} \max_{i,j} |X(i) - X(j)|,$$

where $b_p = \max_{0 \le t \le 1} \bigl(t^p(1 - t) + t(1 - t)^p\bigr)$.

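Lemma 2. Let $A \in M_n(\mathbb{C})$ be a normal matrix, let $D \in M_n(\mathbb{C})$ be a density and let $1 \le p < \infty$. Then

$$(\operatorname{Tr} D|A - \operatorname{Tr} DA|^p)^{1/p} \le 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|A - \lambda I_n\|$$

holds, where $b_p$ denotes the largest $p$th central moment of the Bernoulli distribution.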


Proof. By means of a diagonalization and the previous remarks, for any Hermitian $H$ and density $D$ one has that

$$(\operatorname{Tr}[D|H - \operatorname{Tr} DH|^p])^{1/p} \le b_p^{1/p} \operatorname{diam} \sigma(H) = 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|H - \lambda\|.$$

For a normal $A$, $\min_{\lambda \in \mathbb{C}} \|A - \lambda I_n\|$ equals the radius of the smallest enclosing circle of $\sigma(A)$. Without loss of generality, we can assume that the center of this circle is at the origin. Let us write $A$ as a diagonal matrix $A = \sum_{i=1}^n \lambda_i P_i$, where the $\lambda_i$ are the eigenvalues of $A$ and the $P_i$ are orthogonal projections. Set the diagonal matrices



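$$\tilde A = \operatorname{diag}(\lambda_1, \dots, \lambda_n, -\lambda_1, \dots, -\lambda_n) \quad \text{and} \quad H = \operatorname{diag}(|\lambda_1|, \dots, |\lambda_n|, -|\lambda_1|, \dots, -|\lambda_n|)$$

in $M_{2n}(\mathbb{C})$. Clearly, $\min_{\lambda \in \mathbb{C}} \|\tilde A - \lambda\| = \min_{\lambda \in \mathbb{C}} \|H - \lambda\|$. Moreover, we obtain with $\tilde D = D \oplus 0 \in M_{2n}(\mathbb{C})$ that

$$\begin{aligned}
(\operatorname{Tr}[D|A - \operatorname{Tr} DA|^p])^{1/p} &= (\operatorname{Tr}[\tilde D|\tilde A - \operatorname{Tr} \tilde D \tilde A|^p])^{1/p} \\
&= \Bigl(\operatorname{Tr}\Bigl[\tilde D \Bigl|\sum_{i=1}^n |\lambda_i| \bigl((P_i \oplus -P_i) - \operatorname{Tr}[\tilde D(P_i \oplus -P_i)]\bigr)\Bigr|^p\Bigr]\Bigr)^{1/p} \\
&= (\operatorname{Tr}[\tilde D|H - \operatorname{Tr} \tilde D H|^p])^{1/p}
\end{aligned}$$

and since $H$ is Hermitian

$$\le 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|H - \lambda\| = 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|A - \lambda\|,$$

which completes the proof. □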


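Theorem 4. Let $A \in M_n(\mathbb{C})$ and let $1 \le p < \infty$. Then

$$\Bigl(\frac{1}{n} \operatorname{Tr} \Bigl|A - \frac{1}{n} \operatorname{Tr} A\Bigr|^p\Bigr)^{1/p} \le 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|A - \lambda I_n\|$$

holds, where $b_p$ denotes the largest $p$th central moment of the Bernoulli distribution.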

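Proof. Without loss of generality one can assume that $A$ is a contraction with $\|A\| = 1$. From the singular value decomposition of $A$, one can find two unitaries $U_1$ and $U_2$ such that

$$A = \frac{1}{2} U_1 + \frac{1}{2} U_2$$

(see [2J p. 62-63], for instance). The convexity of the Schatten $p$-norms and the central moment estimates of normal matrices in Lemma 2 imply that

$$\begin{aligned}
\Bigl(\frac{1}{n} \operatorname{Tr} \Bigl|A - \frac{1}{n} \operatorname{Tr} A\Bigr|^p\Bigr)^{1/p} &\le \frac{1}{2} \Bigl(\frac{1}{n} \operatorname{Tr} \Bigl|U_1 - \frac{1}{n} \operatorname{Tr} U_1\Bigr|^p\Bigr)^{1/p} + \frac{1}{2} \Bigl(\frac{1}{n} \operatorname{Tr} \Bigl|U_2 - \frac{1}{n} \operatorname{Tr} U_2\Bigr|^p\Bigr)^{1/p} \\
&\le b_p^{1/p} \|U_1\| + b_p^{1/p} \|U_2\| \\
&= 2b_p^{1/p} \|A\|.
\end{aligned}$$

Changing $A$ to $A - \lambda I$ we get the proof of the statement. □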
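The decomposition $A = \frac{1}{2}U_1 + \frac{1}{2}U_2$ rests on a scalar fact: every $s \in [0, 1]$ is the average of two unimodular numbers, $s = \frac{1}{2}(e^{i\theta} + e^{-i\theta})$ with $\theta = \arccos s$. Applying this to each singular value in $A = W\Sigma Z$ gives $U_{1,2} = W \operatorname{diag}(e^{\pm i\theta_j}) Z$. The sketch below shows only the scalar step; it is the standard argument and not necessarily the one in the cited reference, and the function name is illustrative.

```python
import cmath
import math

def split_into_unimodular(s):
    """For 0 <= s <= 1, return u1, u2 on the unit circle with s = (u1 + u2) / 2."""
    theta = math.acos(s)
    return cmath.exp(1j * theta), cmath.exp(-1j * theta)

# Hypothetical singular values of a contraction.
singular_values = [0.0, 0.3, 0.75, 1.0]
pairs = [split_into_unimodular(s) for s in singular_values]

# Largest error in s = (u1 + u2)/2 and in |u1| = |u2| = 1 over the examples.
average_gap = max(abs((u1 + u2) / 2 - s) for s, (u1, u2) in zip(singular_values, pairs))
modulus_gap = max(abs(abs(u) - 1) for pair in pairs for u in pair)
```

Because $W$ and $Z$ are unitary, the diagonal phase matrices turn into unitaries $U_1$, $U_2$ whose average is $A$.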


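An application of Hölder's inequality gives that the function $p \mapsto b_p^{1/p}$ is monotone increasing on $\mathbb{R}_+$, hence $\lim_{p \to \infty} b_p^{1/p} = b_\infty = 1$. Similarly, $\|\cdot\|_p \to \|\cdot\|$ follows for the Schatten $p$-norms if $p \to \infty$. Therefore, with (2) at hand, we obtain the following.

Corollary 2. Let $A \in M_n(\mathbb{C})$ be a normal matrix. Then

$$\frac{\sqrt{3}}{2} \Bigl\|A - \frac{1}{n} \operatorname{Tr} A\Bigr\| \le \operatorname{spd}(A).$$

Moreover, if $A$ is Hermitian then

$$\Bigl\|A - \frac{1}{n} \operatorname{Tr} A\Bigr\| \le \operatorname{spd}(A).$$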
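For a diagonal normal matrix both sides of Corollary 2 can be read off the eigenvalues: $\|A - \frac{1}{n}\operatorname{Tr}A\|$ is the largest distance from an eigenvalue to the mean of the eigenvalues. The sketch below is a randomized check on hypothetical spectra (the seed and ranges are arbitrary choices), with complex eigenvalues for the normal case and real eigenvalues for the Hermitian case.

```python
import math
import random

def deviation_and_spread(eigs):
    """For A = diag(eigs), return (||A - (Tr A / n) I||, spd(A))."""
    mean = sum(eigs) / len(eigs)
    deviation = max(abs(z - mean) for z in eigs)
    spread = max(abs(a - b) for a in eigs for b in eigs)
    return deviation, spread

rng = random.Random(2)
normal_ok = True
hermitian_ok = True
for _ in range(500):
    n = rng.randint(2, 6)
    complex_eigs = [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
    real_eigs = [rng.uniform(-1, 1) for _ in range(n)]
    dev, spd = deviation_and_spread(complex_eigs)
    normal_ok = normal_ok and math.sqrt(3) / 2 * dev <= spd + 1e-12
    dev, spd = deviation_and_spread(real_eigs)
    hermitian_ok = hermitian_ok and dev <= spd + 1e-12
```

By unitary invariance of the operator norm, the trace and the spread, the diagonal case covers every normal matrix.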


3.2. Central moments of matrix elements. In this section, we make some 
estimates of the moments of matrix elements. 

A conditional expectation operator $E_{\mathfrak{B}}$ is an orthogonal projection from the matrix algebra $M_n(\mathbb{C})$, endowed with the Hilbert-Schmidt inner product $\langle A, B\rangle = \operatorname{Tr} B^*A$, onto the $*$-subalgebra $\mathfrak{B}$ (see [2J Section 4.3]). Here we collect a few basic properties of the conditional expectation operator. First, we recall that for any $A \in M_n(\mathbb{C})$,

$$\operatorname{Tr} A = \operatorname{Tr} E_{\mathfrak{B}}(A).$$

Moreover, for each $B \in \mathfrak{B}$, the module properties

$$E_{\mathfrak{B}}(BA) = B E_{\mathfrak{B}}(A) \quad \text{and} \quad E_{\mathfrak{B}}(AB) = E_{\mathfrak{B}}(A) B$$

hold. A useful property here is that the conditional expectation operators can be uniformly approximated by convex sums of unitary conjugates of $A$. That is, for all $\varepsilon > 0$, there exist unitary operators $U_1, \dots, U_m$ in the commutant algebra of $\mathfrak{B}$ such that

$$\Bigl\|E_{\mathfrak{B}}(A) - \sum_{j=1}^m \lambda_j U_j^* A U_j\Bigr\| \le \varepsilon,$$

where $\sum_{j=1}^m \lambda_j = 1$ and $0 \le \lambda_1, \dots, \lambda_m \le 1$. For a proof of these statements, we refer the reader to [HI Theorem 4.13].

The following proposition might be known in the literature, however, we were 
unable to find any reference. 

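Proposition 1. Let $\mathfrak{C}$ be a unital $*$-subalgebra of $M_n(\mathbb{C})$ and let $E_{\mathfrak{C}}$ be the conditional expectation operator onto $\mathfrak{C}$. Then

$$\operatorname{Tr} |E_{\mathfrak{C}}(A)|^p \le \operatorname{Tr} |A|^p$$

for every $1 \le p < \infty$.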

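Proof. The duality formula tells us that

$$(\operatorname{Tr} |E_{\mathfrak{C}}(A)|^p)^{1/p} = \max_{B \in M_n(\mathbb{C})} \{|\operatorname{Tr}[B E_{\mathfrak{C}}(A)]| : \|B\|_q \le 1\}$$

holds, where $1/p + 1/q = 1$. Furthermore, for any $\varepsilon > 0$, there exist unitary matrices $W_1, \dots, W_m$ such that

$$\Bigl\|E_{\mathfrak{C}}(A) - \sum_{j=1}^m \lambda_j W_j^* A W_j\Bigr\| \le \varepsilon$$

and $\sum_{j=1}^m \lambda_j = 1$ ($\lambda_j \ge 0$). Hence

$$\begin{aligned}
(\operatorname{Tr} |E_{\mathfrak{C}}(A)|^p)^{1/p} &= \max_{B \in M_n(\mathbb{C})} \Bigl\{\Bigl|\operatorname{Tr}\Bigl[\sum_{j=1}^m \lambda_j B W_j^* A W_j\Bigr]\Bigr| : \|B\|_q \le 1\Bigr\} + O(\varepsilon) \\
&\le \max_{B \in M_n(\mathbb{C})} \Bigl\{\sum_{j=1}^m \lambda_j |\operatorname{Tr}[B W_j^* A W_j]| : \|B\|_q \le 1\Bigr\} + O(\varepsilon) \\
&\le \sum_{j=1}^m \lambda_j \max_{B \in M_n(\mathbb{C})} \{|\operatorname{Tr}[B W_j^* A W_j]| : \|B\|_q \le 1\} + O(\varepsilon),
\end{aligned}$$

and applying again the duality formula,

$$\begin{aligned}
&= \sum_{j=1}^m \lambda_j (\operatorname{Tr} |W_j^* A W_j|^p)^{1/p} + O(\varepsilon) \\
&= (\operatorname{Tr} |A|^p)^{1/p} + O(\varepsilon),
\end{aligned}$$

which is what we intended to have. □

Now we can prove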


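Theorem 5. Let $e_1, \dots, e_n$ be an orthonormal basis of $\mathbb{C}^n$. For any $A \in M_n(\mathbb{C})$ and $1 \le p < \infty$,

$$\Bigl(\frac{1}{n} \sum_{i=1}^n \Bigl|\langle Ae_i, e_i\rangle - \frac{1}{n} \sum_{j=1}^n \langle Ae_j, e_j\rangle\Bigr|^p\Bigr)^{1/p} \le \Bigl(\frac{1}{n} \operatorname{Tr} \Bigl|A - \frac{1}{n} \operatorname{Tr} A\Bigr|^p\Bigr)^{1/p} \le 2b_p^{1/p} \min_{\lambda \in \mathbb{C}} \|A - \lambda I_n\|.$$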


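Proof. Let $\mathfrak{C}$ denote the commutative unital $*$-algebra generated by the orthogonal projections $e_i e_i^*$ ($1 \le i \le n$). From the previous proposition one obtains that

$$\sum_{i=1}^n \Bigl|\langle Ae_i, e_i\rangle - \frac{1}{n} \sum_{l=1}^n \langle Ae_l, e_l\rangle\Bigr|^p = \operatorname{Tr} \Bigl|E_{\mathfrak{C}}\Bigl(A - \frac{1}{n} \operatorname{Tr} A\Bigr)\Bigr|^p \le \operatorname{Tr} \Bigl|A - \frac{1}{n} \operatorname{Tr} A\Bigr|^p,$$

which, together with Theorem 4, is what we intended to have. □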
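The first inequality of Theorem 5 can be made explicit for $2 \times 2$ Hermitian matrices, where everything has a closed form. The sketch below is an illustration with hypothetical entries: for $A = \begin{pmatrix} a & b \\ \bar b & d \end{pmatrix}$ with real $a, d$, write $h = (a - d)/2$; then the centred diagonal entries are $\pm h$, while $A - \frac{1}{2}\operatorname{Tr}A$ has eigenvalues $\pm\sqrt{h^2 + |b|^2}$.

```python
import math
import random

def diagonal_side(a, b, d, p):
    """Sum over i of |a_ii - (Tr A)/2|^p for A = [[a, b], [conj(b), d]]; equals 2|h|^p."""
    h = (a - d) / 2
    return 2 * abs(h) ** p

def trace_side(a, b, d, p):
    """Tr |A - (Tr A)/2|^p; the centred matrix has eigenvalues +r and -r, r = sqrt(h^2 + |b|^2)."""
    h = (a - d) / 2
    r = math.sqrt(h * h + abs(b) ** 2)
    return 2 * r ** p

rng = random.Random(3)
pinching_ok = True
for _ in range(1000):
    a, d = rng.uniform(-2, 2), rng.uniform(-2, 2)
    b = complex(rng.uniform(-2, 2), rng.uniform(-2, 2))
    p = rng.uniform(1, 6)
    pinching_ok = pinching_ok and diagonal_side(a, b, d, p) <= trace_side(a, b, d, p) + 1e-12
```

In this case the inequality reduces to $|h| \le \sqrt{h^2 + |b|^2}$, so the off-diagonal entry $b$ measures exactly how much dispersion is lost by passing to the diagonal.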


Lastly, the next corollary gives some information about the spread of Hermitian and normal matrices in terms of the statistical dispersions of their diagonal elements.

Corollary 3. Let $1 \le p < \infty$. Let $A \in M_n(\mathbb{C})$ be a normal matrix. Then





Moreover, if A is Hermitian then 